The cell below contains a number of helper functions used throughout this walkthrough. They are mainly wrappers around existing matplotlib functionality and are provided to keep the steps that follow simple.
Take a moment to read the description of each function so you understand what it can be used for. You will use these "helper methods" as you work through the rest of this notebook.
If you are familiar with matplotlib, feel free to alter the functions as you please.
In [ ]:
# TODO: Make sure you run this cell before continuing!
%matplotlib inline
import matplotlib.pyplot as plt
def show_plot(x_data, y_data, x_label, y_label):
    """
    Display a simple line plot.

    :param x_data: Numpy array containing data for the X axis
    :param y_data: Numpy array containing data for the Y axis
    :param x_label: Label applied to X axis
    :param y_label: Label applied to Y axis
    """
    plt.figure(figsize=(10, 5), dpi=100)
    plt.plot(x_data, y_data, 'b-', marker='|', markersize=2.0, mfc='b')
    # Pass the on/off flag positionally; newer matplotlib renamed `b=` to `visible=`
    plt.grid(True, which='major', color='k', linestyle='-')
    plt.xlabel(x_label)
    plt.ylabel(y_label)
    plt.show()
def plot_box(bbox):
    """
    Display a green bounding box on an image of the blue marble.

    :param bbox: Shapely Polygon that defines the bounding box to display
    """
    from matplotlib.patches import Polygon
    from mpl_toolkits.basemap import Basemap

    min_lon, min_lat, max_lon, max_lat = bbox.bounds
    # Named world_map to avoid shadowing the built-in map()
    world_map = Basemap()
    world_map.bluemarble(scale=0.5)
    # Transparent fill, so only the green outline is drawn over the image
    poly = Polygon(
        [(min_lon, min_lat), (min_lon, max_lat), (max_lon, max_lat), (max_lon, min_lat)],
        facecolor=(0, 0, 0, 0.0), edgecolor='green', linewidth=2)
    plt.gca().add_patch(poly)
    plt.gcf().set_size_inches(10, 15)
    plt.show()
def show_plot_two_series(x_data_a, x_data_b, y_data_a, y_data_b, x_label,
                         y_label_a, y_label_b, series_a_label, series_b_label):
    """
    Display a line plot of two series.

    :param x_data_a: Numpy array containing data for the Series A X axis
    :param x_data_b: Numpy array containing data for the Series B X axis
    :param y_data_a: Numpy array containing data for the Series A Y axis
    :param y_data_b: Numpy array containing data for the Series B Y axis
    :param x_label: Label applied to X axis
    :param y_label_a: Label applied to Y axis for Series A
    :param y_label_b: Label applied to Y axis for Series B
    :param series_a_label: Name of Series A
    :param series_b_label: Name of Series B
    """
    fig, ax1 = plt.subplots(figsize=(10, 5), dpi=100)
    # Series A is drawn in blue against the left-hand Y axis
    series_a, = ax1.plot(x_data_a, y_data_a, 'b-', marker='|', markersize=2.0, mfc='b', label=series_a_label)
    ax1.set_ylabel(y_label_a, color='b')
    ax1.tick_params('y', colors='b')
    ax1.set_ylim(min(0, *y_data_a), max(y_data_a) + .1 * max(y_data_a))
    ax1.set_xlabel(x_label)

    # Series B shares the X axis and is drawn in red against the right-hand Y axis
    ax2 = ax1.twinx()
    series_b, = ax2.plot(x_data_b, y_data_b, 'r-', marker='|', markersize=2.0, mfc='r', label=series_b_label)
    ax2.set_ylabel(y_label_b, color='r')
    ax2.set_ylim(min(0, *y_data_b), max(y_data_b) + .1 * max(y_data_b))
    ax2.tick_params('y', colors='r')

    # Pass the on/off flag positionally; newer matplotlib renamed `b=` to `visible=`
    plt.grid(True, which='major', color='k', linestyle='-')
    plt.legend(handles=(series_a, series_b), bbox_to_anchor=(1.1, 1), loc=2, borderaxespad=0.)
    plt.show()
Now we can interact with NEXUS using the nexuscli Python module. The nexuscli module has a number of useful methods that allow you to easily interact with the NEXUS webservice API. One of those methods is nexuscli.dataset_list, which returns a list of datasets in the system along with their start and end times.
However, before the client can be used, it must be told where the NEXUS webservice is running. The nexuscli.set_target(url) method is used to target NEXUS. An instance of NEXUS is already running for you and is available at http://nexus-webapp:8083.
1. Import the nexuscli Python module.
2. Call nexuscli.dataset_list() and print the results.
In [ ]:
# TODO: Import the nexuscli python module.
# Target the nexus webapp server
nexuscli.set_target("http://nexus-webapp:8083")
# TODO: Call nexuscli.dataset_list() and print the results
Now that we can interact with NEXUS using the nexuscli Python module, we would like to run a time series. To do this, we will use the nexuscli.time_series method. The signature for this method is described below:
nexuscli.time_series(datasets, bounding_box, start_datetime, end_datetime, spark=False)
    Send a request to NEXUS to calculate a time series.

    datasets: Sequence (max length 2) of the name of the dataset(s)
    bounding_box: Bounding box for area of interest as a shapely.geometry.polygon.Polygon
    start_datetime: Start time as a datetime.datetime
    end_datetime: End time as a datetime.datetime
    spark: Optionally use spark. Default: False
    return: List of nexuscli.nexuscli.TimeSeries namedtuples
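The return value is a list of namedtuples, presumably one per requested dataset. The sketch below shows how such a result might be unpacked before plotting. The TimeSeries stand-in and its field names (dataset, time, mean) are assumptions made for illustration, and the values are made up; inspect the object returned by your installed nexuscli (for example with print(result[0]._fields)) to confirm the actual names.

```python
from collections import namedtuple
from datetime import datetime

# Stand-in for nexuscli.nexuscli.TimeSeries; the field names are assumed.
TimeSeries = namedtuple("TimeSeries", ("dataset", "time", "mean"))

# Made-up values standing in for what nexuscli.time_series might return.
result = [
    TimeSeries(
        dataset="AVHRR_OI_L4_GHRSST_NCEI",
        time=[datetime(2013, 1, 1), datetime(2013, 1, 2), datetime(2013, 1, 3)],
        mean=[10.0, 10.5, 11.0],
    )
]

# Take the first (here, only) series and split it into X and Y data.
series = result[0]
x_data, y_data = series.time, series.mean

print(series._fields)
print(len(x_data), max(y_data))
```

With real results, x_data and y_data would be the arrays handed to the show_plot helper.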
As you can see, there are a number of options available. Let's try investigating The Blob in the Pacific Ocean. The Blob is an abnormal warming of the Sea Surface Temperature that was first observed in 2013.
Generate a time series for the AVHRR_OI_L4_GHRSST_NCEI SST dataset for the time period 2013-01-01 through 2014-03-01 and a bounding box -150, 40, -120, 55 (west, south, east, north).
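1. Create the bounding box using the box method.
2. Plot the bounding box using the plot_box helper method.
3. Generate the time series by calling the time_series method in the nexuscli module.
   Hint: datetime is already imported for you. You can create a datetime using the method datetime(int: year, int: month, int: day).
   Hint: Pass spark=True to the time_series function to speed up the computation.
4. Plot the result using the show_plot helper method.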
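The inputs for this step can be sketched as follows, using only the standard library so that it runs without a NEXUS server. The commented-out lines show how the request might look given the time_series signature; they assume shapely's box and nexuscli have been imported, that nexuscli has been targeted, and that a single dataset name is passed as a one-element list.

```python
from datetime import datetime

# Bounding box corners in (west, south, east, north) order, as in the text.
west, south, east, north = -150, 40, -120, 55

# datetime(year, month, day) builds the start and end of the period.
start = datetime(2013, 1, 1)
end = datetime(2014, 3, 1)

# Length of the requested period in days.
span_days = (end - start).days
print(span_days)  # 424

# With shapely and nexuscli available, the request might look like this:
# bbox = box(west, south, east, north)
# result = nexuscli.time_series(["AVHRR_OI_L4_GHRSST_NCEI"], bbox, start, end, spark=True)
```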
In [ ]:
import time
import nexuscli
from datetime import datetime
from shapely.geometry import box
# TODO: Create a bounding box using the box method imported above
# TODO: Plot the bounding box using the helper method plot_box
In [ ]:
# Do not modify this line ##
start = time.perf_counter()#
############################
# TODO: Call the time_series method for the AVHRR_OI_L4_GHRSST_NCEI dataset using
# your bounding box and time period 2013-01-01 through 2014-03-01
# Enter your code above this line
print("Time Series took {} seconds to generate".format(time.perf_counter() - start))
In [ ]:
# TODO: Plot the result using the `show_plot` helper method
Now that you have successfully generated a time series for approximately one year of data, try generating a longer time series by increasing the end date to 2016-12-31. This will take a little longer to execute, since there is more data to analyze, but should finish in under a minute.
The significant increase in sea surface temperature due to The Blob should be visible as an upward trend between 2013 and 2015 in this longer time series.
1. Generate a time series for the period 2013-01-01 to 2016-12-31.
2. Plot the result using the show_plot helper method. Make sure you pass spark=True to the time_series function to speed up the analysis.
   Hint: The numpy and scipy packages are installed and can be used by importing them: import numpy or import scipy.
   Hint: matplotlib has a built-in function for converting datetime values to numbers, matplotlib.dates.date2num, and its inverse, matplotlib.dates.num2date.
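One way to quantify an upward trend is to fit a straight line to the series. The sketch below does this with numpy.polyfit on synthetic values, converting each datetime to a day offset from the first sample (matplotlib.dates.date2num could serve the same purpose). The data is made up purely for illustration and is not NEXUS output.

```python
from datetime import datetime, timedelta

import numpy as np

# Synthetic daily series: 20.0 on day 0, rising by exactly 0.01 per day.
times = [datetime(2013, 1, 1) + timedelta(days=d) for d in range(100)]
values = np.array([20.0 + 0.01 * d for d in range(100)])

# polyfit needs numeric X values, so express each time as days since the start.
days = np.array([(t - times[0]).days for t in times], dtype=float)

# A degree-1 fit returns the coefficients highest power first: slope, intercept.
slope, intercept = np.polyfit(days, values, 1)

# Y values of the fitted line, one per original sample.
trend = slope * days + intercept

print(round(float(slope), 4), round(float(intercept), 2))
```

The trend array could then be plotted against times alongside the original values.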
In [ ]:
import time
import nexuscli
from datetime import datetime
from shapely.geometry import box
bbox = box(-150, 40, -120, 55)
plot_box(bbox)
# Do not modify this line ##
start = time.perf_counter()#
############################
# TODO: Call the time_series method for the AVHRR_OI_L4_GHRSST_NCEI dataset using
# your bounding box and time period 2013-01-01 through 2016-12-31
# Make sure you pass spark=True to the time_series function to speed up the analysis
# Enter your code above this line
print("Time Series took {} seconds to generate".format(time.perf_counter() - start))
In [ ]:
# TODO: Plot the result using the `show_plot` helper method
The time_series method can be used on up to two datasets at one time for comparison. Let's take a look at another region and see how to generate two time series and plot them side by side.
Hurricane Katrina passed to the southwest of Florida on Aug 27, 2005. The ocean response in a 1 x 1 degree region was captured by a number of satellites. The initial ocean response was an immediate cooling of the surface waters by 2 degrees Celsius that lingered for several days. The SST drop is correlated with both wind and precipitation data.
Reference: Xiaoming Liu, Menghua Wang, and Wei Shi. A study of a Hurricane Katrina–induced phytoplankton bloom using satellite observations and model simulations. Journal of Geophysical Research, Vol. 114, C03023, doi:10.1029/2008JC004934, 2009. http://shoni2.princeton.edu/ftp/lyo/journals/Ocean/phybiogeochem/Liu-etal-KatrinaChlBloom-JGR2009.pdf
Plot the time series for the AVHRR_OI_L4_GHRSST_NCEI SST dataset and the TRMM_3B42_daily Precipitation dataset for the region -84.5, 23.5, -83.5, 24.5 and time frame of 2005-08-24 through 2005-09-10. Plot the result using the show_plot_two_series helper method and see if you can recognize the correlation between the spike in precipitation and the decrease in temperature.
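1. Create a bounding box for the region (-84.5, 23.5, -83.5, 24.5).
2. Generate the time series by calling the time_series method in the nexuscli module.
3. Plot the result using the show_plot_two_series helper method.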
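To put a number on the relationship rather than judging it by eye, the Pearson correlation coefficient between the two series can be computed once both are sampled on the same days. The sketch below uses numpy.corrcoef with made-up values chosen only to show the mechanics; they are not the NEXUS results for this region.

```python
import numpy as np

# Made-up daily values for illustration only (not real measurements).
sst = np.array([30.0, 30.1, 29.9, 28.5, 28.0, 28.2, 28.6, 29.0])
precip = np.array([0.0, 1.0, 5.0, 60.0, 40.0, 10.0, 2.0, 0.0])

# corrcoef returns a 2x2 matrix; the off-diagonal entry is the coefficient.
r = np.corrcoef(sst, precip)[0, 1]

# A negative coefficient means rain spikes coincide with cooler water.
print(r < 0)
```

Because the cooling lingers for several days after the rainfall peak, a same-day coefficient may understate the relationship; comparing the series at a lag of a day or two is a natural follow-up.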
In [ ]:
import time
import nexuscli
from datetime import datetime
from shapely.geometry import box
# TODO: Create a bounding box using the box method imported above
# TODO: Plot the bounding box using the helper method plot_box
In [ ]:
# Do not modify this line ##
start = time.perf_counter()#
############################
# TODO: Call the time_series method for the AVHRR_OI_L4_GHRSST_NCEI dataset and the `TRMM_3B42_daily` dataset
# using your bounding box and time period 2005-08-24 through 2005-09-10
# Enter your code above this line
print("Time Series took {} seconds to generate".format(time.perf_counter() - start))
In [ ]:
# TODO: Plot the result using the `show_plot_two_series` helper method
Let's return to The Blob region, but this time we're going to use a different calculation: Daily Difference Average (also known as an anomaly plot).
The Daily Difference Average algorithm compares a dataset against a climatological mean and produces a time series of the difference from that mean. Given The Blob region, we should expect to see a positive difference from the mean temperature in that region (indicating higher temperatures than normal) between 2013 and 2014.
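The idea can be sketched in a few lines: for each observation, look up the climatological mean for the same time of year and subtract it. The example below uses monthly climatology values and observations that are invented for illustration; the actual NEXUS algorithm runs on the server, and its details (such as the resolution of the climatology) are not described here.

```python
from datetime import datetime

# Invented climatological means keyed by month number.
climatology = {1: 9.0, 2: 8.5, 3: 8.8}

# Invented observations: (timestamp, observed mean temperature).
observations = [
    (datetime(2014, 1, 15), 10.2),
    (datetime(2014, 2, 15), 10.0),
    (datetime(2014, 3, 15), 9.1),
]

# Anomaly = observed value minus the climatological mean for that month.
anomalies = [
    (t, round(value - climatology[t.month], 2))
    for t, value in observations
]

for t, diff in anomalies:
    print(t.date(), diff)
```

Positive differences, as in this made-up data, correspond to temperatures above the climatological mean.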
This time, using the nexuscli module, call the daily_difference_average method. The signature for that method is reprinted below:
nexuscli.daily_difference_average(dataset, bounding_box, start_datetime, end_datetime)
    Generate an anomaly time series for a given dataset, bounding box, and timeframe.

    dataset: Name of the dataset as a String
    bounding_box: Bounding box for area of interest as a shapely.geometry.polygon.Polygon
    start_datetime: Start time as a datetime.datetime
    end_datetime: End time as a datetime.datetime
    return: List of nexuscli.nexuscli.TimeSeries namedtuples
Generate an anomaly time series using the AVHRR_OI_L4_GHRSST_NCEI SST dataset for the time period 2013-01-01 through 2016-12-31 and a bounding box -150, 40, -120, 55 (west, south, east, north).
1. Generate the anomaly time series by calling the daily_difference_average method in the nexuscli module.
2. Plot the result using the show_plot helper method.
In [ ]:
import time
import nexuscli
from datetime import datetime
from shapely.geometry import box
bbox = box(-150, 40, -120, 55)
plot_box(bbox)
# Do not modify this line ##
start = time.perf_counter()#
############################
# TODO: Call the daily_difference_average method for the AVHRR_OI_L4_GHRSST_NCEI dataset using
# your bounding box and time period 2013-01-01 through 2016-12-31. Pass spark=True to speed up
# processing only if your version of the method accepts it; the parameter list above does not include it.
# Enter your code above this line
print("Daily Difference Average took {} seconds to generate".format(time.perf_counter() - start))
In [ ]:
# TODO: Plot the result using the `show_plot` helper method
You have finished this workbook.
If others are still working, please feel free to modify the examples and play with the client module or go back and complete the "Advanced" challenges if you skipped them. Further technical information about NEXUS can be found in the GitHub repository.
If you would like to save this notebook for reference later, click on File -> Download as... and choose your preferred format.